Stream-based Parallel Computing Methodology and Development Environment for High Performance Manycore Accelerators

Authors

  • Shinichi Yamagiwa
  • Gabriel Falcao
  • Koichi Wada
  • Leonel Sousa

Abstract

The latest supercomputers incorporate a large number of compute units in the form of manycore accelerators. Such accelerators, like GPUs, integrate processors on which a massive number of threads, on the order of thousands, execute concurrently. Compared to single-CPU throughput, they offer higher levels of parallelism. Therefore, they represent an indispensable technology in the new era of high performance supercomputing. However, the accelerator is attached via the peripheral bus of the host CPU, which inevitably creates communication overheads when exchanging programs and data between the CPU and the accelerator. In addition, programs must be developed for both processors, which have distinct architectures. To make it simpler for the programmer to use the accelerator and exploit its potential throughput performance, this chapter describes the Caravela platform. Caravela provides a simple programming interface that overcomes the difficulty of developing and running parallel kernels not only on single but also on multiple manycore accelerators. This chapter describes parallel programming techniques and methods applied to a variety of research test case scenarios.

Similar resources

Compiler transformation of nested loops for general purpose GPUs

Manycore accelerators have the potential to significantly improve performance of scientific applications when offloading computationally intensive program portions to accelerators. Directive-based high-level programming models, such as OpenACC and OpenMP, are used to create applications for accelerators through annotating regions of code meant for offloading. OpenACC is an emerging directive-ba...

Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.

Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing using the MPI and OpenMP programming languages on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...

Assessment Methodology for Anomaly-Based Intrusion Detection in Cloud Computing

Cloud computing has become an attractive target for attackers as the mainstream technologies in the cloud, such as the virtualization and multitenancy, permit multiple users to utilize the same physical resource, thereby posing the so-called problem of internal facing security. Moreover, the traditional network-based intrusion detection systems (IDSs) are ineffective to be deployed in the cloud...

A Hardware-Oblivious Optimizer for Data Stream Processing

High throughput and low latency are key requirements for data stream processing. This is achieved typically through different optimizations on software and hardware level, like multithreading and distributed computing. While any concept can be applied to particular systems, their impact on performance and their configuration can differ greatly depending on underlying hardware. Our goal is an op...

Bind: a Partitioned Global Workflow Parallel Programming Model

High Performance Computing is notorious for its long and expensive software development cycle. To address this challenge, we present Bind: a "partitioned global workflow" parallel programming model for C++ applications that enables quick prototyping and agile development cycles for high performance computing software targeting heterogeneous distributed manycore architectures. We present applica...

Publication date: 2015